AITopics

Country:

Asia > China > Guangdong Province > Guangzhou (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
Europe > Croatia > Dubrovnik-Neretva County > Dubrovnik (0.04)
(3 more...)

Genre: Research Report > Experimental Study (1.00)

Industry:

Education > Curriculum > Subject-Specific Education (0.68)
Health & Medicine (0.47)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.92)

Neural Information Processing SystemsFeb-12-2026, 15:23:33 GMT

Unsupervised Text Style Transfer using Language Models as Discriminators

Zichao Yang, Zhiting Hu, Chris Dyer, Eric P. Xing, Taylor Berg-Kirkpatrick

Neural Information Processing Systems http://nips.cc/

arxiv preprint arxiv, discriminator, language model, (13 more...)

Country:

Europe > Germany > Saxony > Leipzig (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > Canada > Quebec > Montreal (0.04)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Neural Information Processing SystemsFeb-9-2026, 07:34:32 GMT

8c00dee24c9878fea090ed070b44f1ab-Paper.pdf

machine learning, natural language, translation, (17 more...)

Country:

South America > Paraguay > Asunción > Asunción (0.04)
North America > Canada (0.04)
Europe > Finland > Southwest Finland > Turku (0.04)
Asia > Japan > Kyūshū & Okinawa > Kyūshū > Miyazaki Prefecture > Miyazaki (0.04)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Johnny, Samuel Ebimobowei, Guda, Blessed, Aaron, Emmanuel Enejo, Gueye, Assane

Pose-Based Sign Language Spotting via an End-to-End Encoder Architecture

arXiv.org Artificial IntelligenceDec-10-2025

Automatic Sign Language Recognition (ASLR) has emerged as a vital field for bridging the gap between deaf and hearing communities. However, the problem of sign-to-sign retrieval or detecting a specific sign within a sequence of continuous signs remains largely unexplored. We define this novel task as Sign Language Spotting. In this paper, we present a first step toward sign language retrieval by addressing the challenge of detecting the presence or absence of a query sign video within a sentence-level gloss or sign video. Unlike conventional approaches that rely on intermediate gloss recognition or text-based matching, we propose an end-to-end model that directly operates on pose keypoints extracted from sign videos. Our architecture employs an encoder-only backbone with a binary classification head to determine whether the query sign appears within the target sequence. By focusing on pose representations instead of raw RGB frames, our method significantly reduces computational cost and mitigates visual noise. We evaluate our approach on the Word Presence Prediction dataset from the WSLP 2025 shared task, achieving 61.88\% accuracy and 60.00\% F1-score. These results demonstrate the effectiveness of our pose-based framework for Sign Language Spotting, establishing a strong foundation for future research in automatic sign language retrieval and verification. Code is available at https://github.com/EbimoJohnny/Pose-Based-Sign-Language-Spotting

artificial intelligence, machine learning, natural language, (17 more...)

2512.08738

Country:

North America > United States (0.15)
Europe > Spain (0.14)
Africa > Rwanda (0.14)

Genre: Research Report > New Finding (0.48)

Industry: Education > Curriculum > Subject-Specific Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Thomas, Marshall, Fish, Edward, Bowden, Richard

SignBind-LLM: Multi-Stage Modality Fusion for Sign Language Translation

arXiv.org Artificial IntelligenceDec-5-2025

Despite progress in gloss-free Sign Language Translation (SLT), traditional single modality end-to-end approaches consistently fail on two critical components of natural signing: the precise recognition of high-speed fingerspelling and the integration of asynchronous non-manual cues from the face. Recent progress in SLT with Large Language Models has side stepped this challenge, forcing a single network to learn these simultaneously resulting in poor performance when tasked with translating crucial information such as names, places, and technical terms. We introduce SignBind-LLM, a modular framework designed to overcome these limitations. Our approach employs separate, specialized predictors for continuous signing, fingerspelling, and lipreading. Each expert network first decodes its specific modality into a sequence of tokens. These parallel streams are then fused by a lightweight transformer that resolves temporal misalignments before passing the combined representation to a Large Language Model (LLM) for final sentence generation. Our method establishes a new state-of-the-art on the How2Sign, ChicagoFSWildPlus, and BOBSL datasets with a BLEU-4 score of 22.1, 73.2% letter accuracy and BLEU-4 score of 6.8 respectively. These results validate our core hypothesis: isolating and solving distinct recognition tasks before fusion provides a more powerful and effective pathway to robust, high-fidelity sign language translation.

large language model, machine learning, translation, (18 more...)

2509.0003

Country: Europe > United Kingdom (0.28)

Genre: Research Report > New Finding (0.67)

Industry: Education > Curriculum > Subject-Specific Education (0.86)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Rubaiyeat, Husne Ara, Mahmud, Hasan, Hasan, Md Kamrul

Bangla Sign Language Translation: Dataset Creation Challenges, Benchmarking and Prospects

arXiv.org Artificial IntelligenceNov-27-2025

Bangla Sign Language Translation (BdSLT) has been severely constrained so far as the language itself is very low resource. Standard sentence level dataset creation for BdSLT is of immense importance for developing AI based assistive tools for deaf and hard of hearing people of Bangla speaking community. In this paper, we present a dataset, IsharaKhobor , and two subset of it for enabling research. We also present the challenges towards developing the dataset and present some way forward by benchmarking with landmark based raw and RQE embedding. We do some ablation on vocabulary restriction and canonicalization of the same within the dataset, which resulted in two more datasets, IsharaKhobor_small and IsharaKhobor_canonical_small. The dataset is publicly available at: www.kaggle.com/datasets/hasanssl/isharakhobor [1].

artificial intelligence, machine learning, natural language, (16 more...)

2511.21533

Genre: Research Report (0.51)

Industry: Education > Curriculum > Subject-Specific Education (0.66)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Neural Information Processing SystemsNov-20-2025, 15:46:57 GMT

Unsupervised Text Style Transfer using Language Models as Discriminators

Zichao Yang, Zhiting Hu, Chris Dyer, Eric P. Xing, Taylor Berg-Kirkpatrick

Unsupervised text style transfer requires learning disentangled representations of attributes (e.g., negative/positive sentiment, plaintext/ciphertext orthography) and underlying content.

artificial intelligence, machine learning, natural language, (16 more...)

Country:

Europe > Germany > Saxony > Leipzig (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > Canada > Quebec > Montreal (0.04)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Yazdani, Shakib, Hamidullah, Yasser, España-Bonet, Cristina, van Genabith, Josef

Seeing, Signing, and Saying: A Vision-Language Model-Assisted Pipeline for Sign Language Data Acquisition and Curation from Social Media

arXiv.org Artificial IntelligenceOct-30-2025

Most existing sign language translation (SLT) datasets are limited in scale, lack multilingual coverage, and are costly to curate due to their reliance on expert annotation and controlled recording setup. Recently, Vision Language Models (VLMs) have demonstrated strong capabilities as evaluators and real-time assistants. Despite these advancements, their potential remains untapped in the context of sign language dataset acquisition. To bridge this gap, we introduce the first automated annotation and filtering framework that utilizes VLMs to reduce reliance on manual effort while preserving data quality. Our method is applied to TikTok videos across eight sign languages and to the already curated YouTube-SL-25 dataset in German Sign Language for the purpose of additional evaluation. Our VLM-based pipeline includes a face visibility detection, a sign activity recognition, a text extraction from video content, and a judgment step to validate alignment between video and text, implementing generic filtering, annotation and validation steps. Using the resulting corpus, TikTok-SL-8, we assess the performance of two off-the-shelf SLT models on our filtered dataset for German and American Sign Languages, with the goal of establishing baselines and evaluating the robustness of recent models on automatically extracted, slightly noisy data. Our work enables scalable, weakly supervised pretraining for SLT and facilitates data acquisition from social media.

artificial intelligence, machine learning, natural language, (17 more...)

2510.25413

Country:

Europe > Spain (0.46)
Europe > Germany (0.28)

Genre: Research Report (0.82)

Industry: Education > Curriculum > Subject-Specific Education (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

Hamidullah, Yasser, Yazdani, Shakib, Oguz, Cennet, van Genabith, Josef, España-Bonet, Cristina

SONAR-SLT: Multilingual Sign Language Translation via Language-Agnostic Sentence Embedding Supervision

arXiv.org Artificial IntelligenceOct-23-2025

Sign language translation (SLT) is typically trained with text in a single spoken language, which limits scalability and cross-language generalization. Earlier approaches have replaced gloss supervision with text-based sentence embeddings, but up to now, these remain tied to a specific language and modality. In contrast, here we employ language-agnostic, multimodal embeddings trained on text and speech from multiple languages to supervise SLT, enabling direct multilingual translation. To address data scarcity, we propose a coupled augmentation method that combines multilingual target augmentations (i.e. translations into many languages) with video-level perturbations, improving model robustness. Experiments show consistent BLEURT gains over text-only sentence embedding supervision, with larger improvements in low-resource settings. Our results demonstrate that language-agnostic embedding supervision, combined with coupled augmentation, provides a scalable and semantically robust alternative to traditional SLT training.

artificial intelligence, natural language, translation, (16 more...)